Detection and ranging based on a single monoscopic frame

ABSTRACT

One or more stereoscopic images are generated based on a single monoscopic image that may be obtained from a camera sensor. Each stereoscopic image includes a first digital image and a second digital image that, when viewed using any suitable stereoscopic viewing technique, result in a user or software program receiving a three-dimensional effect with respect to the elements included in the stereoscopic images. The monoscopic image may depict a geographic setting of a particular geographic location and the resulting stereoscopic image may provide a three-dimensional (3D) rendering of the geographic setting. Use of the stereoscopic image helps a system obtain more accurate detection and ranging capabilities. The stereoscopic image may be any configuration of the first digital image (monoscopic) and the second digital image (monoscopic) that together may generate a 3D effect as perceived by a viewer or software program.

FIELD

The embodiments discussed in this disclosure relate to detection and ranging based on a single monoscopic frame.

BACKGROUND

Detection and ranging applications have increased in demand with the advent of autonomous and semi-autonomous vehicles. To help facilitate autonomous and semi-autonomous operation of vehicles, an ability to detect and range objects in an environment becomes increasingly helpful. Further considerations of autonomous and semi-autonomous operation of vehicles may include safety, such as an ability to stay on a trajectory of travel and avoid collisions with objects. Accordingly, some systems have been developed for detection, ranging, and/or safety purposes.

For example, in some conventional systems, actual three-dimensional cameras may be used to capture three-dimensional images. In other conventional systems, a multitude of monoscopic cameras may be employed to create a three-dimensional effect when the combined images from all the different cameras are stitched together. Such systems are vision-based, while other conventional systems may be signal-based. For example, RADAR uses radio signals and LIDAR uses laser signals to detect and range objects. However, each of the foregoing conventional systems may be deficient in one or more aspects. For example, three-dimensional cameras are bulky and/or expensive, as is LIDAR technology or a host of monoscopic cameras like the approximately eight cameras used by some TESLA® autonomous/semi-autonomous vehicles. In addition to cost, size, and/or ease of implementation, technology limitations may also be a factor. For example, LIDAR may have limited usage at nighttime, in cloudy weather, or at high altitudes (e.g., above 2000 meters). Additionally, for example, RADAR may not detect small objects or provide a precise image of an object due to wavelength of the radio signals.

In addition, humans have a binocular vision system that uses two eyes spaced approximately two and a half inches (approximately 6.5 centimeters) apart. Each eye sees the world from a slightly different perspective. The brain uses the difference in these perspectives to calculate or gauge distance. This binocular vision system is partly responsible for the ability to determine with relatively good accuracy the distance of an object. The relative distance of multiple objects in a field-of-view may also be determined with the help of binocular vision.

Three-dimensional (stereoscopic) imaging takes advantage of the depth perceived by binocular vision by presenting two images to a viewer where one image is presented to one eye (e.g., the left eye) and the other image is presented to the other eye (e.g., the right eye). The images presented to the two eyes may include substantially the same elements, but the elements in the two images may be offset from each other to mimic the offsetting perspective that may be perceived by the viewer's eyes in everyday life. Therefore, the viewer may perceive depth in the elements depicted by the images.

SUMMARY

According to one or more embodiments of the present disclosure, one or more stereoscopic images may be generated based on a single monoscopic image that may be obtained from a camera sensor. The stereoscopic images may each include the first digital image and the second digital image that, when viewed using any suitable stereoscopic viewing technique, may result in a user or software program receiving a three-dimensional effect with respect to the elements included in the stereoscopic images. The monoscopic image may depict a geographic setting of a particular geographic location and the resulting stereoscopic image may provide a three-dimensional (3D) rendering of the geographic setting. Use of the stereoscopic image may help a system obtain more accurate detection and ranging capabilities. Reference to a “stereoscopic image” in the present disclosure may refer to any configuration of the first digital image (monoscopic) and the second digital image (monoscopic) that together may generate a 3D effect as perceived by a viewer or software program.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1A illustrates an example system configured to generate stereoscopic (3D) images, according to some embodiments of the present disclosure.

FIG. 1B illustrates an example environment in which stereoscopic image generation based on a single monoscopic frame occurs.

FIG. 2 illustrates an example flow diagram of a method for detection and ranging based on a single monoscopic frame.

FIG. 3 illustrates an example system that may be used in detection and ranging based on a single monoscopic frame.

FIG. 4 illustrates an example of a depth map generated by a detection application and/or a stereoscopic image module.

FIG. 5 illustrates an example of a stereoscopic pair provided to a graph-based model for training purposes.

DETAILED DESCRIPTION

FIG. 1A illustrates an example system 100 configured to generate stereoscopic (3D) images, according to some embodiments of the present disclosure. The system 100 may include a stereoscopic image generation module 104 (referred to hereinafter as “stereoscopic image module 104”) configured to generate one or more stereoscopic images 108. The stereoscopic image module 104 may include any suitable system, apparatus, or device configured to receive monoscopic images 102 and to generate each of the stereoscopic images 108 based on two or more of the monoscopic images 102. For example, in some embodiments, the stereoscopic image module 104 may include software that includes computer-executable instructions configured to cause a processor to perform operations for generating the stereoscopic images 108 based on the monoscopic images 102.

In some embodiments, the monoscopic images 102 may include digital images obtained by a camera sensor that depict a setting. For example, the monoscopic images 102 may include digital images that depict an object in the setting. In some embodiments, the object may be any element that is visually detectable, such as a tree, a pedestrian, a flying bird, an airplane, an airborne missile, a ship, a buoy, a river or ocean, a curb, a traffic sign, traffic lines (e.g., double lines indicating a “no pass zone”), a mountain, a wall, a house, a fire hydrant, a dog, or any other suitable object visually detectable by a camera sensor. In some embodiments, the stereoscopic image module 104 may be configured to acquire the monoscopic images 102 via a detection application communicatively coupled to the camera sensor. As referred to in the present disclosure, “detection application” is short for “detection and ranging application.”

In some embodiments, the stereoscopic image module 104 may be configured to access the detection application (such as the detection application 124 of FIG. 1B) via any suitable network such as the network 128 of FIG. 1B to request the monoscopic images 102 from the detection application. In these or other embodiments, the detection application and associated monoscopic images 102 may be stored on a same device that may include the stereoscopic image module 104. In these or other embodiments, the stereoscopic image module 104 may be configured to access the detection application stored on the device to request the monoscopic images 102 from a storage area of the device on which they may be stored.

Additionally or alternatively, the stereoscopic image module 104 may be included with the detection application in which the stereoscopic image module 104 may obtain the monoscopic images 102 via the detection application by accessing portions of the detection application that control obtaining the monoscopic images 102. In other embodiments, the stereoscopic image module 104 may be separate from the detection application (e.g., as shown in FIG. 1B), but may be configured to interface with the detection application to obtain the monoscopic images 102.

The stereoscopic image module 104 may be configured to generate the stereoscopic images 108 as indicated below. To aid in explanation of the concepts, the description is given with respect to generation of an example stereoscopic image 120 (illustrated in FIG. 1B and described below), which may be an example of one of the stereoscopic images 108 of FIG. 1A. Further, the description is given with respect to generation of the stereoscopic image 120 based on an example first digital image 110 and an example second digital image 112, which are illustrated in FIG. 1B. The first digital image 110 and the second digital image 112 are examples of monoscopic images that may be included with the monoscopic images 102 of FIG. 1A.

FIG. 1B illustrates an example environment 105 in which stereoscopic image generation based on a single monoscopic frame occurs. The elements of FIG. 1B may be arranged according to one or more embodiments of the present disclosure. As illustrated, FIG. 1B includes: a machine 122 having a detection application 124 and a computing system 126; a network 128; and a stereoscopic image module 130 having a graph-based model 132 and a computing system 134. Further illustrated are a setting 109, a first digital image 110, a second digital image 112, a focal point 113, a camera 114, focal distances 115a/115b, an imaginary camera 116, and a displacement factor 118. In some embodiments, the stereoscopic image module 130 may be the same as or similar to the stereoscopic image module 104 described above in conjunction with FIG. 1A. Additionally or alternatively, the computing system 126 and the computing system 134 may be the same as or similar to the system 300 described below in conjunction with FIG. 3.

In some embodiments, the setting 109 may include any geographical setting in which the camera 114 may capture an image. For example, the setting 109 may include garages, driveways, streets, sidewalks, oceans, rivers, skies, forests, cities, villages, landing/launching areas such as airport runways and flight decks, warehouses, stores, inventory aisles, and any other suitable environment in which the machine 122 may detect and range objects. Accordingly, when the camera 114 captures the first digital image 110, the first digital image 110 may include any aspect and/or portion of the setting 109. Additionally or alternatively, the first digital image 110 may include the focal point 113 based on the focal distance 115a of the camera 114. In these or other embodiments, the focal distance 115a to the focal point 113 may be a known constant based on specifications of the camera 114.

In some embodiments, the camera 114 may be attached to the machine 122. In the present disclosure, reference to “machine” may refer to any device configured to store and/or execute computer code, e.g., executable instructions of a software application. In some embodiments, the machine may be movable from a first geographic position (e.g., “Point A”) to a second geographic position (e.g., “Point B”). In these or other embodiments, the machine 122 may be autonomous or semi-autonomous with respect to moving between geographic positions. Alternatively, the machine 122 may be human-operated between geographic positions. Examples of a machine 122 may include robots, drones, rockets, space stations, self-driving cars/trucks, human-operated cars/trucks, equipment (e.g., construction/maintenance equipment such as a backhoe, a street-sweeper, a steam roller, etc.), storage pods (e.g., a transportable storage unit, etc.), or any other suitable device configured to move between geographic positions.

Additionally or alternatively, the machine may include a device that is stationary, and in some embodiments, fixed in position. For example, the machine may include an anti-missile device stationed at a military base, a security device fixed at a perimeter of a prison, a hovering helicopter, or any other suitable machine, whether temporarily stationary or permanently fixed in position. Additionally or alternatively, the machine may include a client device. Some examples of the client device may include a mobile phone, a smartphone, a tablet computer, a laptop computer, a desktop computer, a set-top box, a virtual-reality device, a wearable device, a connected device, any mobility device that has an operating system, a satellite, etc.

In these or other embodiments, the detection and ranging capabilities of the machine 122 enabled by the present application may be advantageous in any variety of fields or industries, including, for example: commercial/industrial purposes, manufacturing purposes, military purposes (e.g., Army, Navy, National Guard, Marines, Air Force, and Space Force), government agency purposes (e.g., Federal Bureau of Investigation, Central Intelligence Agency, and National Transportation Safety Board), etc.

Additionally or alternatively, the machine 122 may detect and/or range along a trajectory. The trajectory may include any path of travel and/or a surrounding area for the machine 122, whether in air, on land, in space, or on water. In these or other embodiments, the camera 114 may be configured to capture in the first digital image 110 a portion of the trajectory of the machine 122, e.g., the portion of the trajectory nearest to the machine 122, another portion of the trajectory farthest away from the machine 122, or another portion not necessarily part of the trajectory of the machine 122. As an example, the camera 114 may capture a portion of the trajectory up to about two meters away from the machine 122; up to about five meters away from the machine 122; up to about twenty meters away from the machine 122; up to about fifty meters away from the machine 122; up to about one hundred meters away from the machine 122; up to about two hundred meters away from the machine 122; up to about five hundred meters away from the machine 122; up to about one thousand meters away from the machine 122; up to about five thousand meters away from the machine 122; etc. The advancement of camera technology (including camera lens technology) may continue to facilitate advantages in imaging speed, resolution, measurement accuracy, and focal distances.

In some embodiments, the first digital image 110 captured by the camera 114 may be obtained by the detection application 124. For example, the detection application 124 may request the first digital image 110 from the camera 114. Additionally or alternatively, the detection application 124 may receive the first digital image 110 as sent from the camera 114.

In these or other embodiments, the stereoscopic image module 130 may obtain the first digital image 110 from the detection application 124. For example, the stereoscopic image module 130 may request the first digital image 110 from the detection application 124. Additionally or alternatively, the stereoscopic image module 130 may receive the first digital image 110 as sent from the detection application 124. In these or other embodiments, the stereoscopic image module 130 may obtain the first digital image 110 via the network 128, e.g., where the stereoscopic image module 130 is positioned remotely from the machine 122 as shown in FIG. 1B, such as at a remote server. The remote server may be the same as or similar to the computing system 134. Additionally or alternatively, the remote server may include one or more computing devices (such as a rackmount server, a router computer, a server computer, a personal computer, a mainframe computer, a laptop computer, a tablet computer, a desktop computer, a smartphone, cars, drones, a robot, any mobility device that has an operating system, etc.), data stores (e.g., hard disks, memories, databases), networks, software components, and/or hardware components. In other embodiments, the stereoscopic image module 130 may obtain the first digital image 110 without the network 128, e.g., where the stereoscopic image module 130 is integrated with the machine 122 (e.g., not positioned at the remote server).

In some embodiments, the network 128 may be any network or configuration of networks configured to send and receive communications between systems and devices. In some embodiments, the network 128 may include a conventional type network, a wired or wireless network, and may have numerous different configurations. Additionally or alternatively, the network 128 may include any suitable topology, configuration or configurations including a star configuration, token ring configuration, or other configurations. The network 128 may include a local area network (LAN), a wide area network (WAN) (e.g., the Internet), DECT ULE, and/or other interconnected data paths across which multiple devices may communicate. In some embodiments, the network 128 may include a peer-to-peer network. The network 128 may also be coupled to or include portions of a telecommunications network that may enable communication of data in a variety of different communication protocols. In some embodiments, the network 128 may include Bluetooth® communication networks (e.g., MESH Bluetooth) and/or cellular communication networks for sending and receiving data including via short messaging service (SMS), multimedia messaging service (MMS), hypertext transfer protocol (HTTP), direct data connection, wireless application protocol (WAP), e-mail, or the like. Further, the network 128 may include WiFi, NFC, LTE, LTE-Advanced, 1G, 2G, 3G, 4G, 5G, etc., ZigBee®, LoRa® (a wireless technology developed to enable low data rate communications to be made over long distances by sensors and actuators for machine to machine communication and internet of things (IoT) applications), wireless USB, or any other such wireless technology.

In some embodiments, after the first digital image 110 is obtained by the stereoscopic image module 130, the stereoscopic image module 130 may input the first digital image 110 into the graph-based model 132. As referred to in the present disclosure, the term “graph-based model” may include a deep neural network, a deep belief network, a recurrent neural network, or some other graph model such as a genetic programming model or a tree-based or forest-based machine learning model. Thus, the graph-based model 132 may include any artificial intelligence system or learning-based mechanism, examples of which may include: perceptron, multilayer perceptron, feed forward, radial basis network, deep feed forward, recurrent neural network, long/short term memory, gated recurrent unit, auto encoder, variational auto encoder, denoising auto encoder, sparse auto encoder, any sequence-to-sequence model, shallow neural networks, Markov chain, Hopfield network, Boltzmann machine, restricted Boltzmann machine, deep belief network, deep convolutional network, convolutional neural network (e.g., VGG-16), deconvolutional network, deep convolutional inverse graphics network, modular neural network, generative adversarial network, liquid state machine, extreme learning machine, echo state network, recursive neural network, deep residual network, Kohonen network, support vector machine, neural Turing machine, etc.

In some embodiments, the graph-based model 132 may be trained to generate (e.g., with help of the computing system 134) the second digital image 112 based on input in the form of the first digital image 110. The training of the graph-based model 132 is described later in this disclosure. In these or other embodiments, the second digital image 112 may be configured to be an image of a same area or a similar area of the setting 109. Thus, in some embodiments, the first digital image 110 and the second digital image 112 may substantially overlap. In these or other embodiments, data may be discarded that corresponds to portions where the first digital image 110 and the second digital image 112 do not overlap. Additionally or alternatively, the second digital image 112 may be generated as a monoscopic image that visually mimics what the imaginary camera 116 would image if the imaginary camera 116 were an actual camera like the camera 114. In these or other embodiments, the imaginary camera 116 is virtually positioned at a different position from an actual position of the camera 114. Thus, in some embodiments, an object imaged in the first digital image 110 may be imaged from a first position and/or at a first angle. Additionally or alternatively, the object may be imaged in the second digital image 112 from a second position and/or at a second angle such that the second position and/or the second angle are different from the first position and the first angle, respectively. In this manner, the stereoscopic image 120 with perceptible depth may be generated using the first digital image 110 captured by the camera 114 and the second digital image 112 generated by the stereoscopic image module 130.

In these or other embodiments, the positional relationship of the camera 114 relative to the imaginary camera 116 may include the displacement factor 118. As referred to in the present disclosure, the displacement factor 118 may include: an angle or orientation with respect to one or more axes (e.g., roll, pitch, and yaw), an offset lateral distance or offset vertical height, etc. In some embodiments, the displacement factor 118 may be a known constant. Additionally or alternatively, the displacement factor 118 may be set at a value such that the stereoscopic image 120 resulting from the second digital image 112 is of sufficient quality and accuracy. For example, the displacement factor 118 may be set at a value such that distance measurements based on the stereoscopic image 120 are sufficiently accurate and/or fit a certain model.

In some embodiments, the stereoscopic image 120 may be used to generate a depth map. For example, the detection application 124 and/or the stereoscopic image module 130 may generate the depth map. An example of a depth map is illustrated in FIG. 4. The depth map may include a corresponding pixel for each pixel in the stereoscopic image 120. Each corresponding pixel in the depth map may be representative of relative distance data from the camera 114 for each respective pixel in the stereoscopic image 120. For example, a pixel in the depth map having a certain shade of purple or gray-scale may correspond to a particular relative distance, which is not an actual distance value. Thus, in some embodiments, a pixel in a first depth map and a pixel in a second depth map may include a same shade of color or gray-scale, yet have different actual distance values (e.g., even orders of magnitude different actual distance values). In this manner, color or gray-scale in the generated depth map does not represent an actual distance value for a pixel; rather, the color or gray-scale of a pixel in the generated depth map may represent a distance value relative to adjacent pixels.
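The relative nature of the depth map described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the relative distance data is available as a per-pixel disparity (the offset of an element between the first digital image and the second digital image); the function name relative_depth_map and the sample values are hypothetical and are not taken from the present disclosure.

```python
def relative_depth_map(disparity):
    """Scale a 2D list of per-pixel disparities to gray levels 0..255.

    Assumption for this sketch: larger disparity means the pixel is nearer,
    so gray level 255 marks the nearest pixel and 0 the farthest.
    """
    flat = [value for row in disparity for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A flat scene carries no relative depth information.
        return [[0 for _ in row] for row in disparity]
    return [
        [round(255 * (value - lowest) / span) for value in row]
        for row in disparity
    ]


# Two hypothetical scenes whose disparities differ by a factor of 100.
scene_a = [[2.0, 4.0], [6.0, 8.0]]
scene_b = [[value * 100 for value in row] for row in scene_a]

map_a = relative_depth_map(scene_a)
map_b = relative_depth_map(scene_b)
print(map_a)           # [[0, 85], [170, 255]]
print(map_a == map_b)  # True
```

Both scenes yield identical gray levels even though their underlying values differ by two orders of magnitude, which is why a shade in the depth map alone cannot be read as an actual distance value.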

In some embodiments, a subset of pixels of a total amount of pixels in the depth map may be associated with an object. For example, the detection application 124 and/or the stereoscopic image module 130 may determine that the subset of pixels in the depth map is indicative of the object. In this manner, a presence of an object may be preliminarily identified or detected, though not necessarily ranged. To range the detected object, a portion of the subset of pixels associated with the object may be analyzed. In some embodiments, the portion of the subset of pixels may be analyzed as opposed to the entire subset of pixels associated with the object to reduce computational overhead, increase ranging speed, etc. For example, every pixel associated with a pedestrian (e.g., the feet, legs, torso, neck, and head) need not all be ranged. Rather, one or more portions of pixels associated with the pedestrian may be considered as representative of where the pedestrian is located relative to the camera 114 for ranging purposes. In these or other embodiments, the subset of pixels associated with the object may be averaged, segmented, or otherwise simplified to a portion of the subset of pixels. Additionally or alternatively, a resolution of one or both of the stereoscopic image 120 and the depth map may be temporarily decreased (and later restored to original resolution). In this manner, the portion of the subset of pixels may include relative distance data sufficiently representative of the object.
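One way to reduce the subset of pixels to a representative portion is sketched below. The sketch assumes that the object has already been detected and is described by a set of (row, column) coordinates; the stride-based sampling, the use of a median, and the name representative_relative_distance are illustrative choices rather than the approach required by the present disclosure.

```python
from statistics import median


def representative_relative_distance(depth_map, object_pixels, stride=4):
    """Return one relative distance value for a detected object.

    Only every `stride`-th object pixel (in row-major order) is read from
    the depth map, and the median of the sampled values is returned.
    """
    if not object_pixels:
        raise ValueError("object_pixels must contain at least one pixel")
    sampled = sorted(object_pixels)[::stride]
    return median(depth_map[row][col] for row, col in sampled)


# Hypothetical 4x4 depth map with a 2x2 object in the middle.
depth_map = [
    [10, 10, 10, 10],
    [10, 200, 202, 10],
    [10, 198, 204, 10],
    [10, 10, 10, 10],
]
object_pixels = {(1, 1), (1, 2), (2, 1), (2, 2)}

print(representative_relative_distance(depth_map, object_pixels, stride=2))
# 199.0
```

With a stride of 2, only two of the four object pixels are read, which mirrors the goal of lowering computational overhead while keeping a value that is still representative of the object.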

In some embodiments, the relative distance data for the object may be converted to an actual distance value (e.g., in inches, feet, meters, kilometers, etc.). To convert the relative distance data based on the depth map to an actual distance value to an object, a pre-determined relationship may be used between the relative distance data, the focal point 113 of the first digital image 110 and the second digital image 112, the displacement factor 118 between the camera 114 and the imaginary camera 116, and/or a correction curve that compensates for an offset in distance measurements based on perceived depth in the stereoscopic image 120. In these or other embodiments, as the distance from the camera 114 increases, the relative distance data in the depth map may decrease in accuracy. Therefore, once the actual distance data is converted from the relative distance data, an amount of offset from the actual distance data may be graphed or fitted to a curve as a function of actual distance. Thus, in some embodiments, a curve of correction values may be implemented to correct an offset from the actual distance data.
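The present disclosure does not fix the form of the pre-determined relationship. As a hedged illustration only, the sketch below uses the standard pinhole stereo triangulation, distance = focal length × baseline / disparity, and assumes that the displacement factor is a purely lateral offset (a baseline in meters), that the focal distance is expressed in pixels, and that the relative distance data is a disparity in pixels. The correction curve is modeled as hypothetical calibration points of (distance, offset) with linear interpolation between them, where the offset is taken to be the measured value minus the true value.

```python
def stereo_distance(disparity_px, focal_length_px, baseline_m):
    """Pinhole stereo triangulation: Z = f * B / d (an assumed relationship)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def correction_offset(distance_m, curve):
    """Linearly interpolate the offset at distance_m.

    `curve` is a list of (distance_m, offset_m) points sorted by distance.
    Values outside the calibrated range are clamped to the end points.
    """
    if distance_m <= curve[0][0]:
        return curve[0][1]
    for (d0, o0), (d1, o1) in zip(curve, curve[1:]):
        if distance_m <= d1:
            t = (distance_m - d0) / (d1 - d0)
            return o0 + t * (o1 - o0)
    return curve[-1][1]


def corrected_distance(disparity_px, focal_length_px, baseline_m, curve):
    """Raw triangulated distance minus the interpolated offset."""
    raw = stereo_distance(disparity_px, focal_length_px, baseline_m)
    return raw - correction_offset(raw, curve)


# Hypothetical calibration: the offset grows with distance.
CURVE = [(0.0, 0.0), (10.0, 0.5), (50.0, 4.0)]

raw = stereo_distance(20.0, 800.0, 0.5)
print(raw)                                          # 20.0
print(corrected_distance(20.0, 800.0, 0.5, CURVE))  # 18.625
```

The growing offsets in the hypothetical curve reflect the statement above that accuracy may decrease as the distance from the camera increases; in practice the curve would be fitted from measurements rather than chosen by hand.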

In some embodiments, the graph-based model 132 may be trained to generate the second digital image 112 based on a single monoscopic image such as the first digital image 110 for subsequent generation of the stereoscopic image 120. To train the graph-based model 132, stereoscopic pair images may be provided to the graph-based model 132. The stereoscopic pair images may include a first monoscopic image and a second monoscopic image. An example of a stereoscopic pair provided to the graph-based model 132 for training purposes is illustrated in FIG. 5. In these or other embodiments, the first monoscopic image and the second monoscopic image may include images taken of any same or similar setting, but from different positions and/or angles. In this manner, the first monoscopic image and the second monoscopic image taken together may form a stereoscopic pair with perceivable depth. Additionally or alternatively, the first monoscopic image and the second monoscopic image may include a setting 109 of any type, nature, location, or subject. Some stereoscopic pair images may be related by type, nature, location, or subject; however, diversity among the stereoscopic pair images in addition to increased quantity may help improve a training quality or capability of the graph-based model 132 to generate the second digital image 112 and the stereoscopic image 120 of sufficient quality and accuracy.

In some embodiments, the training of the graph-based model 132 may occur on a server side, e.g., at the stereoscopic image module 130 when positioned remotely from the machine 122. Additionally or alternatively, the training of the graph-based model 132 may be a one-time process, after which generation of the second digital image 112 and stereoscopic image 120 may be enabled. In other embodiments, the training of the graph-based model 132 may occur on an as-needed basis, a rolling basis (e.g., continually), or on an interval basis (e.g., a predetermined schedule). As an example of an as-needed basis, inaccuracies or safety threats may come to light, e.g., in the event of a safety violation or accident. In such a case, additional training focused on inaccuracies or safety threats may be provided to the graph-based model 132. Additionally or alternatively, one or more aspects of training of the graph-based model 132 may occur at the machine 122, e.g., via the detection application 124. As an example, feedback may be received at the graph-based model 132 from: the detection application 124 via the machine 122, a user of the machine 122 via the machine 122, a third party such as a law enforcement officer, etc.

Modifications, additions, or omissions may be made to the environment 105 without departing from the scope of the present disclosure. For example, the environment 105 may include other elements than those specifically listed. Additionally, the environment 105 may be included in any number of different systems or devices.

FIG. 2 illustrates an example flow diagram of a method 200 for detection and ranging based on a single monoscopic frame. The method 200 may be arranged in accordance with at least one embodiment described in the present disclosure. The method 200 may be performed, in whole or in part, in some embodiments by a software system and/or a processing system, such as a system 300 described below in conjunction with FIG. 3. In these and other embodiments, some or all of the steps of the method 200 may be performed based on the execution of instructions stored on one or more non-transitory computer-readable media. Although illustrated as discrete blocks, various blocks may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the desired implementation.

The method 200 may begin at block 205 at which a first digital image is obtained via one or both of a detection application and a camera sensor. The first digital image may be a monoscopic image that depicts a setting from a first position of the camera sensor communicatively coupled to the detection application. In some embodiments, the first digital image may include a trajectory of a machine.

At block 210, a second digital image may be generated based on the first digital image. The second digital image may be a monoscopic image that depicts a setting from a second position different from the first position. In these or other embodiments, the second digital image is not an image captured by a camera, such as the camera capturing the first digital image of block 205.

At block 215, a stereoscopic image of the setting may be generated. The stereoscopic image may include the first digital image and the second digital image. In these or other embodiments, the stereoscopic image may be an image from which detection and ranging determinations may be based.
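Blocks 205, 210, and 215 can be summarized in a short sketch. Everything named here is hypothetical: run_method_200, StereoscopicImage, and the callables passed in are illustrative, and shift_model is only a stand-in that shifts each pixel row sideways. It is not a trained graph-based model and is included solely so that the sketch can run end to end.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]  # a tiny grayscale image as rows of pixel values


@dataclass
class StereoscopicImage:
    first: Image   # captured monoscopic image (block 205)
    second: Image  # generated monoscopic image (block 210)


def run_method_200(
    capture: Callable[[], Image],
    model: Callable[[Image], Image],
) -> StereoscopicImage:
    first = capture()                        # block 205: obtain first image
    second = model(first)                    # block 210: generate second image
    return StereoscopicImage(first, second)  # block 215: form stereoscopic image


def shift_model(image: Image, shift: int = 1) -> Image:
    """Stand-in for a trained model: shift rows left, repeating the edge pixel."""
    return [row[shift:] + [row[-1]] * shift for row in image]


stereo = run_method_200(lambda: [[1, 2, 3], [4, 5, 6]], shift_model)
print(stereo.second)  # [[2, 3, 3], [5, 6, 6]]
```

Passing the capture step and the model as callables keeps the sketch agnostic about whether the image comes from a camera sensor or a detection application, and about which kind of graph-based model produces the second image.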

One skilled in the art will appreciate that, for this and other methods disclosed in this disclosure, the blocks of the methods may be implemented in differing order. Furthermore, the blocks are only provided as examples, and some of the blocks may be optional, combined into fewer blocks, or expanded into additional blocks.

For example, in some embodiments, one or more additional blocks may be included in the method 200 that include obtaining a plurality of stereoscopic pair images that each includes a first monoscopic image and a second monoscopic image; and sending the plurality of stereoscopic pair images as inputs into a graph-based model. In this manner, the graph-based model may be trained to generate the second digital image of block 210 based on the first digital image for subsequent generation of the stereoscopic image of block 215.

Additionally or alternatively, one or more additional blocks may be included in the method 200 that include sending the first digital image as an input into the graph-based model, wherein the second digital image is output from the graph-based model based on one or both of the plurality of stereoscopic pair images and the first digital image input into the graph-based model.

Additionally or alternatively, one or more additional blocks may be included in the method 200 that include generating a depth map that includes a corresponding pixel for each pixel in the stereoscopic image, each corresponding pixel in the depth map representative of relative distance data from the camera sensor for each respective pixel in the stereoscopic image.
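As a rough illustration of a depth map holding one relative value per pixel, the following Python sketch converts a per-pixel disparity map into relative distances between 0 (nearest) and 1 (farthest). The disparity input and the function name `relative_depth_map` are assumptions made for this example; the disclosure does not specify how the relative distance data is computed, and the linear normalization here only preserves near-to-far ordering.

```python
from typing import List


def relative_depth_map(disparity: List[List[float]]) -> List[List[float]]:
    """Return one relative-distance value per input pixel (0 = nearest, 1 = farthest).

    Assumption: a larger disparity between the two images means a nearer pixel.
    """
    flat = [d for row in disparity for d in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A flat scene carries no relative depth information.
        return [[0.0 for _ in row] for row in disparity]
    return [[(hi - d) / span for d in row] for row in disparity]


disparity_px = [[8.0, 4.0], [2.0, 0.0]]
depth_map = relative_depth_map(disparity_px)
```

The output has the same dimensions as the input, so each depth-map pixel corresponds to exactly one pixel of the stereoscopic image.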

Additionally or alternatively, one or more additional blocks may be included in the method 200 that include associating a subset of pixels of a total amount of pixels in the depth map as indicative of an object; and based on a portion of the subset of pixels in the depth map associated with the object, obtaining an actual distance from the camera sensor to the object in the stereoscopic image using: the relative distance data of the portion associated with the object; a focal point of the first digital image and the second digital image; and a displacement factor between the first digital image and the second digital image. In some embodiments, obtaining an actual distance to the object may include determining a correction value that compensates for an offset in distance measurements based on perceived depth in the stereoscopic image.
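One common way to turn these three inputs into a metric distance is stereo triangulation, sketched below in Python. This is a sketch under explicit assumptions, not necessarily the computation used in the disclosed embodiments: the relative distance data is treated as a disparity in pixels, the focal point as a focal length in pixels, the displacement factor as a baseline in meters between the first and second positions, and the correction value as a simple additive offset. The function name `actual_distance` and all parameter names are hypothetical.

```python
from statistics import median
from typing import Sequence


def actual_distance(
    disparities_px: Sequence[float],  # assumed: disparities of the object's pixels
    focal_length_px: float,           # assumed reading of "focal point"
    baseline_m: float,                # assumed reading of "displacement factor"
    correction_m: float = 0.0,        # assumed additive correction value
) -> float:
    """Estimate distance with Z = f * B / d, using the median disparity."""
    d = median(disparities_px)
    if d <= 0:
        raise ValueError("disparity must be positive to triangulate a distance")
    return focal_length_px * baseline_m / d + correction_m


object_disparities = [19.0, 20.0, 21.0]
distance_m = actual_distance(object_disparities, focal_length_px=800.0, baseline_m=0.5)
corrected_m = actual_distance(object_disparities, 800.0, 0.5, correction_m=-0.5)
```

Taking the median over the object's pixels makes the estimate less sensitive to a few outlying pixels than a mean would be.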

Additionally or alternatively, one or more additional blocks may be included in the method 200 that include sending a warning for presentation via the detection application when the actual distance to the object satisfies a first threshold distance; and/or causing, via the detection application, a machine communicatively coupled to the detection application to perform a corrective action when the actual distance to the object satisfies a second threshold distance. In some embodiments, the first threshold distance and the second threshold distance may be the same, while in other embodiments, they may be different distances to the detected object. Additionally or alternatively, the first threshold distance and/or the second threshold distance may vary depending on any of a myriad of factors. For example, contributing factors affecting the first and second threshold distances may include: a speed of the machine and/or object, a trajectory of the machine and/or object, regulating rules or laws, a cost/benefit analysis, a risk predictive analysis, or any other suitable type of factor in which a threshold distance between the machine and a detected object may be merited.

In some embodiments, the warning for presentation (e.g., at a display) via the detection application may include a visual warning signal and/or an audible warning signal. Additionally or alternatively, the detection application may cause the machine to perform a corrective action that includes stopping the machine, slowing the machine, swerving the machine, dropping/raising an altitude of the machine, an avoiding maneuver, or any other suitable type of corrective action to mitigate damage to the machine and the object and/or prevent contact between the machine and the object.
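The two-threshold behavior can be illustrated with a short Python sketch. It assumes that a distance "satisfies" a threshold when it is less than or equal to that threshold, which the disclosure does not state explicitly; the function name `respond_to_distance` and the action labels are hypothetical.

```python
from typing import List


def respond_to_distance(
    distance_m: float,
    warn_threshold_m: float,  # first threshold distance
    act_threshold_m: float,   # second threshold distance
) -> List[str]:
    """Return the responses triggered by an object at the given distance."""
    actions = []
    if distance_m <= warn_threshold_m:
        actions.append("warn")               # e.g., visual and/or audible signal
    if distance_m <= act_threshold_m:
        actions.append("corrective_action")  # e.g., slow, stop, or swerve
    return actions


far = respond_to_distance(50.0, warn_threshold_m=30.0, act_threshold_m=10.0)
near = respond_to_distance(25.0, warn_threshold_m=30.0, act_threshold_m=10.0)
imminent = respond_to_distance(5.0, warn_threshold_m=30.0, act_threshold_m=10.0)
```

Because the two checks are independent, the same function covers embodiments in which the thresholds are equal and those in which they differ. Thresholds that vary with speed or trajectory could be computed by the caller before each call.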

Additionally or alternatively, one or more additional blocks may be included in the method 200 that include determining a presence of an object within the stereoscopic image; and based on image recognition processing of the object via a graph-based model, classifying the object. In some embodiments, determining a presence of an object within the stereoscopic image may include an analysis of pixels within the stereoscopic image and/or within the depth map. For example, if a group of pixels form an example shape or comprise a particular color or gray-scale, the presence of an object may be inferred. In these or other embodiments, recognition of the object may be a separate step.

In some embodiments, image recognition may include image recognition training of a graph-based model. For example, the graph-based model may be fed input data (e.g., images of objects), and output of the graph-based model (e.g., guesses) may be compared to expected results such as predetermined or human designated labels. With additional cycles through the input data, weights, biases, and other parameters in the graph-based model may be modified to decrease the error rate of the guesses. For example, weights in the graph-based model may be adjusted so that the guesses better match the predetermined or human designated labels of the images of objects.
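The compare-and-adjust cycle described above can be illustrated with a deliberately small Python sketch. A single-node perceptron stands in for the graph-based model, and four two-feature samples stand in for labeled images; both are assumptions chosen for brevity, and the names `predict` and `train` are hypothetical.

```python
from typing import List, Tuple


def predict(weights: List[float], bias: float, features: List[float]) -> int:
    """The model's guess: 1 if the weighted sum is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(
    samples: List[List[float]],
    labels: List[int],
    epochs: int = 20,
    learning_rate: float = 1.0,
) -> Tuple[List[float], float, List[int]]:
    """Cycle through the data, nudging weights and bias whenever a guess is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    errors_per_epoch = []
    for _ in range(epochs):
        errors = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)  # guess vs. label
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
                bias += learning_rate * error
        errors_per_epoch.append(errors)
    return weights, bias, errors_per_epoch


# Toy stand-in for labeled training data: label is 1 only when both features are 1.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0, 0, 0, 1]
weights, bias, errors_per_epoch = train(samples, labels)
guesses = [predict(weights, bias, s) for s in samples]
```

On this toy data the number of wrong guesses per cycle reaches zero within the allotted epochs. A practical image classifier would have many more parameters and would typically use gradient-based updates rather than this rule.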

In these or other embodiments, the input data fed to the graph-based model for training purposes may include images of a host of different objects. Hundreds, thousands, or millions of images of objects may be provided to the graph-based model. Additionally or alternatively, the images of the objects provided to the graph-based model may include labels that correspond to one or more features, pixels, boundaries, or any other detectable aspect of the objects.

In these or other embodiments, additional or alternative image recognition techniques may be used with the graph-based model to classify the objects. Examples may include using: greyscale; RGB (red, green, and blue) values ranging from, for example, zero to 255; pre-processing techniques (e.g., image cropping/flipping/angle manipulation, adjustment of image hue, contrast and saturation, etc.); testing subsets or small batch sizes of data as opposed to entire datasets; and max-pooling to reduce the dimensions of an image by taking the maximum pixel value of a grid.
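As an illustration of the last technique in the list above, the following Python sketch applies non-overlapping max-pooling to a small grayscale image. The function name `max_pool` and the 2x2 grid size are assumptions for this example; rows or columns that do not fill a complete grid are dropped.

```python
from typing import List


def max_pool(image: List[List[int]], size: int = 2) -> List[List[int]]:
    """Reduce an image by keeping the maximum pixel of each size-by-size grid."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [7, 6, 3, 2],
]
pooled = max_pool(image)  # 4x4 input becomes 2x2: [[4, 5], [7, 9]]
```

Each output pixel summarizes four input pixels, so the image dimensions are halved in both directions while the strongest response in each grid is retained.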

FIG. 3 illustrates an example system 300 that may be used in topological optimization of graph-based models. The system 300 may be arranged in accordance with at least one embodiment described in the present disclosure. The system 300 may include a processor 310, memory 312, a communication unit 316, a display 318, a user interface unit 320, and a peripheral device 322, which all may be communicatively coupled. In some embodiments, the system 300 may be part of any of the systems or devices described in this disclosure.

Generally, the processor 310 may include any suitable special-purpose or general-purpose computer, computing entity, or processing device including various computer hardware or software modules and may be configured to execute instructions stored on any applicable computer-readable storage media. For example, the processor 310 may include a microprocessor, a microcontroller, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), or any other digital or analog circuitry configured to interpret and/or to execute program instructions and/or to process data.

Although illustrated as a single processor in FIG. 3, it is understood that the processor 310 may include any number of processors distributed across any number of networks or physical locations that are configured to perform individually or collectively any number of operations described in this disclosure. In some embodiments, the processor 310 may interpret and/or execute program instructions and/or process data stored in the memory 312. In some embodiments, the processor 310 may execute the program instructions stored in the memory 312.

For example, in some embodiments, the processor 310 may execute program instructions stored in the memory 312 that are related to detection and ranging based on a single monoscopic frame. In these and other embodiments, instructions may be used to perform one or more operations or functions described in the present disclosure.

The memory 312 may include computer-readable storage media or one or more computer-readable storage mediums for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable storage media may be any available media that may be accessed by a general-purpose or special-purpose computer, such as the processor 310. By way of example, and not limitation, such computer-readable storage media may include non-transitory computer-readable storage media including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices), or any other storage medium which may be used to carry or store particular program code in the form of computer-executable instructions or data structures and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media. Computer-executable instructions may include, for example, instructions and data configured to cause the processor 310 to perform a certain operation or group of operations as described in this disclosure. In these and other embodiments, the term “non-transitory” as explained in the present disclosure should be construed to exclude only those types of transitory media that were found to fall outside the scope of patentable subject matter in the Federal Circuit decision of In re Nuijten, 500 F.3d 1346 (Fed. Cir. 2007). Combinations of the above may also be included within the scope of computer-readable media.

The communication unit 316 may include any component, device, system, or combination thereof that is configured to transmit or receive information over a network. In some embodiments, the communication unit 316 may communicate with other devices at other locations, the same location, or even other components within the same system. For example, the communication unit 316 may include a modem, a network card (wireless or wired), an infrared communication device, a wireless communication device (such as an antenna), and/or chipset (such as a Bluetooth device, an 802.6 device (e.g., Metropolitan Area Network (MAN)), a Wi-Fi device, a WiMax device, cellular communication facilities, etc.), and/or the like. The communication unit 316 may permit data to be exchanged with a network and/or any other devices or systems described in the present disclosure.

The display 318 may be configured as one or more displays, like an LCD, LED, or other type of display. For example, the display 318 may be configured to present topologies, indicate mutations to topologies, indicate warning notices, show validation performance improvement values, display weights, biases, etc., and other data as directed by the processor 310.

The user interface unit 320 may include any device to allow a user to interface with the system 300. For example, the user interface unit 320 may include a mouse, a track pad, a keyboard, buttons, and/or a touchscreen, among other devices. The user interface unit 320 may receive input from a user and provide the input to the processor 310. In some embodiments, the user interface unit 320 and the display 318 may be combined.

The peripheral device 322 may include one or more devices. For example, the peripheral device 322 may include a sensor, a microphone, and/or a speaker, among other peripheral devices.

Modifications, additions, or omissions may be made to the system 300 without departing from the scope of the present disclosure. For example, in some embodiments, the system 300 may include any number of other components that may not be explicitly illustrated or described. Further, depending on certain implementations, the system 300 may not include one or more of the components illustrated and described.

In accordance with common practice, the various features illustrated in the drawings may not be drawn to scale. The illustrations presented in the present disclosure are not meant to be actual views of any particular apparatus (e.g., device, system, etc.) or method, but are merely idealized representations that are employed to describe various embodiments of the disclosure. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may be simplified for clarity. Thus, the drawings may not depict all of the components of a given apparatus (e.g., device) or all operations of a particular method.

Terms used herein and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including, but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes, but is not limited to,” etc.).

Additionally, if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations.

In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” or “one or more of A, B, and C, etc.” is used, in general such a construction is intended to include A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together, etc. For example, the use of the term “and/or” is intended to be construed in this manner. Additionally, the terms “about,” “substantially,” and “approximately” should be interpreted to mean a value within 10% of an actual value, for example, values like 3 mm or 100% (percent).

Further, any disjunctive word or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” should be understood to include the possibilities of “A” or “B” or “A and B.”

Additionally, the terms “first,” “second,” “third,” etc., are not necessarily used herein to connote a specific order or number of elements. Generally, the terms “first,” “second,” “third,” etc., are used to distinguish between different elements as generic identifiers. Absent a showing that the terms “first,” “second,” “third,” etc., connote a specific order, these terms should not be understood to connote a specific order. Furthermore, absent a showing that the terms “first,” “second,” “third,” etc., connote a specific number of elements, these terms should not be understood to connote a specific number of elements. For example, a first widget may be described as having a first side and a second widget may be described as having a second side. The use of the term “second side” with respect to the second widget may be to distinguish such side of the second widget from the “first side” of the first widget and not to connote that the second widget has two sides.

All examples and conditional language recited herein are intended for pedagogical objects to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Although embodiments of the present disclosure have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the present disclosure.

What is claimed is:
 1. A method comprising: obtaining a first digital image via a detection application, the first digital image a monoscopic image depicting a setting from a first position of a camera sensor communicatively coupled to the detection application; based on the first digital image, generating a second digital image that is monoscopic and depicts the setting from a second position different from the first position; and generating a stereoscopic image of the setting that includes the first digital image and the second digital image.
 2. The method of claim 1, further comprising: obtaining a plurality of stereoscopic pair images that each includes a first monoscopic image and a second monoscopic image; and sending the plurality of stereoscopic pair images as inputs into a graph-based model.
 3. The method of claim 2, further comprising: sending the first digital image as an input into the graph-based model, wherein the second digital image is output from the graph-based model based on one or both of the plurality of stereoscopic pair images and the first digital image input into the graph-based model.
 4. The method of claim 1, further comprising: generating a depth map that includes a corresponding pixel for each pixel in the stereoscopic image, each corresponding pixel in the depth map representative of relative distance data from the camera sensor for each respective pixel in the stereoscopic image.
 5. The method of claim 4, further comprising: associating a subset of pixels of a total amount of pixels in the depth map as indicative of an object; and based on a portion of the subset of pixels in the depth map associated with the object, obtaining an actual distance from the camera sensor to the object in the stereoscopic image using: the relative distance data of the portion associated with object; a focal point of the first digital image and the second digital image; and a displacement factor between the first digital image and the second digital image.
 6. The method of claim 5, further comprising: sending a warning for presentation via the detection application when the actual distance to the object satisfies a first threshold distance; or causing, via the detection application, a machine communicatively coupled to the detection application to perform a corrective action when the actual distance to the object satisfies a second threshold distance.
 7. The method of claim 5, wherein obtaining the actual distance to the object includes determining a correction value that compensates for an offset in distance measurements based on perceived depth in the stereoscopic image.
 8. The method of claim 1, further comprising: determining a presence of an object within the stereoscopic image; and based on image recognition processing of the object via a graph-based model, classifying the object.
 9. The method of claim 1, wherein the first digital image includes a trajectory of a machine.
 10. A system comprising: a display; a processor coupled to the display and configured to direct data to be presented on the display; and at least one non-transitory computer-readable media communicatively coupled to the processor and configured to store one or more instructions that when executed by the processor cause or direct the system to perform operations comprising: obtain a first digital image via a camera sensor associated with a machine, the first digital image a monoscopic image depicting a first area of a setting from a first position of a camera sensor communicatively coupled to the machine; based on the first digital image, generate a second digital image that is monoscopic and depicts the setting from a second position different from the first position; and generate a stereoscopic image of the setting that includes the first digital image and the second digital image.
 11. The system of claim 10, wherein the operations further comprise: generating a depth map that includes a corresponding pixel for each pixel in the stereoscopic image, each corresponding pixel in the depth map representative of relative distance data from the camera sensor for each respective pixel in the stereoscopic image.
 12. The system of claim 11, wherein the operations further comprise: associating a subset of pixels of a total amount of pixels in the depth map as indicative of an object; and based on a portion of the subset of pixels in the depth map associated with the object, obtaining an actual distance from the camera sensor to the object in the stereoscopic image using: the relative distance data of the portion associated with object; a focal point of the first digital image and the second digital image; and a displacement factor between the first digital image and the second digital image.
 13. The system of claim 12, wherein the operations further comprise: sending a warning for presentation at the display via a detection application when the actual distance to the object satisfies a first threshold distance; or causing, via the detection application, a machine communicatively coupled to the detection application to perform a corrective action when the actual distance to the object satisfies a second threshold distance.
 14. The system of claim 12, wherein obtaining the actual distance to the object includes determining a correction value that compensates for an offset in distance measurements based on perceived depth in the stereoscopic image.
 15. The system of claim 10, wherein the operations further comprise: determining a presence of an object within the stereoscopic image; and based on image recognition processing of the object via a graph-based model, classifying the object.
 16. The system of claim 10, wherein the first digital image includes a trajectory of a machine.
 17. A system comprising: a processor; and at least one non-transitory computer-readable media communicatively coupled to the processor and configured to store one or more instructions that when executed by the processor cause or direct the system to perform operations comprising: obtain a first digital image via a camera sensor associated with a machine, the first digital image a monoscopic image depicting a first area of a setting from a first position of a camera sensor communicatively coupled to the machine; based on the first digital image, generate a second digital image that is monoscopic and depicts the setting from a second position different from the first position; and generating a stereoscopic image of the setting that includes the first digital image and the second digital image.
 18. The system of claim 17, wherein the operations further comprise: generating a depth map that includes a corresponding pixel for each pixel in the stereoscopic image, each corresponding pixel in the depth map representative of relative distance data from the camera sensor for each respective pixel in the stereoscopic image.
 19. The system of claim 18, wherein the operations further comprise: associating a subset of pixels of a total amount of pixels in the depth map as indicative of an object; and based on a portion of the subset of pixels in the depth map associated with the object, obtaining an actual distance from the camera sensor to the object in the stereoscopic image using: the relative distance data of the portion associated with object; a focal point of the first digital image and the second digital image; and a displacement factor between the first digital image and the second digital image.